Approximations de l'Algorithme Itérations sur les Politiques Modifié
Internal identifier: 001B24 ( Main/Exploration ); previous: 001B23; next: 001B25
Authors: Bruno Scherrer [France]; Victor Gabillon [France]; Mohammad Ghavamzadeh [France]; Matthieu Geist [France]
Source:
Abstract
Modified Policy Iteration (MPI) is a dynamic programming algorithm that generalizes the two well-known algorithms Value Iteration (VI) and Policy Iteration (PI). Despite its generality, this algorithm, and in particular its approximate implementation, which is used when the state/action spaces are very large, has not yet been analyzed in depth. We propose three approximate implementations of MPI (AMPI) that extend algorithms from the literature (Fitted Value Iteration, Fitted Q-Iteration, and Classification-Based Policy Iteration). We develop an error propagation analysis that unifies those developed independently for VI and PI in the literature. Finally, we provide a finite-sample analysis for the last algorithm, based on a policy classifier, which is in a sense the most general. An interesting observation is that the main parameter of MPI controls, in the performance bound, the balance between the errors in computing the values and those in estimating the greedy policy.
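As an illustration only, here is a minimal sketch of exact (tabular) MPI in its usual textbook formulation: a greedy step followed by m applications of the Bellman operator of the greedy policy. It is not one of the three approximate implementations proposed by the authors. The function name, the argument layout (`P[a][s][t]`, `R[s][a]`), and the toy two-state problem are assumptions made for this example. Under this formulation, m = 1 behaves like Value Iteration, and a large m approaches Policy Iteration.

```python
def modified_policy_iteration(P, R, gamma, m, iterations):
    """Tabular MPI sketch (hypothetical helper, not the authors' code).

    P[a][s][t]: probability of moving from state s to state t under action a.
    R[s][a]:    immediate reward for taking action a in state s.
    gamma:      discount factor in [0, 1).
    m:          number of partial evaluation steps per iteration (MPI's main parameter).
    """
    n_states = len(R)
    n_actions = len(R[0])
    v = [0.0] * n_states
    policy = [0] * n_states

    def backup(s, a, values):
        # One-step lookahead: reward plus discounted expected next value.
        return R[s][a] + gamma * sum(P[a][s][t] * values[t] for t in range(n_states))

    for _ in range(iterations):
        # Greedy step: pick the action with the best one-step lookahead in each state.
        policy = [max(range(n_actions), key=lambda a: backup(s, a, v))
                  for s in range(n_states)]
        # Partial evaluation: apply the Bellman operator of the greedy policy m times.
        for _ in range(m):
            v = [backup(s, policy[s], v) for s in range(n_states)]
    return policy, v


# Toy problem (assumed for illustration): action 0 stays, action 1 switches state;
# only staying in state 1 yields a reward of 1.
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay
    [[0.0, 1.0], [1.0, 0.0]],  # action 1: switch
]
R = [[0.0, 0.0], [1.0, 0.0]]

policy, values = modified_policy_iteration(P, R, gamma=0.5, m=5, iterations=50)
print(policy, values)
```

In an approximate setting, the greedy step and the evaluation step would each be replaced by an estimate (for instance a classifier or a regressor), which is where the two error sources mentioned in the abstract arise.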
Url:
Affiliations:
- France
- Franche-Comté, Grand Est, Lorraine (région)
- Besançon, Metz, Nancy
- Université Paul Verlaine - Metz, Université de Bourgogne Franche-Comté, Université de Franche-Comté, Université de Lorraine
Links toward previous steps (curation, corpus...)
- to stream Hal, to step Corpus: 005721
- to stream Hal, to step Curation: 005721
- to stream Hal, to step Checkpoint: 001744
- to stream Main, to step Merge: 001B54
- to stream Main, to step Curation: 001B24
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="fr">Approximations de l'Algorithme Itérations sur les Politiques Modifié</title>
<author><name sortKey="Scherrer, Bruno" sort="Scherrer, Bruno" uniqKey="Scherrer B" first="Bruno" last="Scherrer">Bruno Scherrer</name>
<affiliation wicri:level="1"><hal:affiliation type="researchteam" xml:id="struct-205124" status="OLD"><idno type="RNSR">200218290B</idno>
<orgName>Autonomous intelligent machine</orgName>
<orgName type="acronym">MAIA</orgName>
<date type="end">2014-12-31</date>
<desc><address><country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/equipes/maia</ref>
</desc>
<listRelation><relation active="#struct-129671" type="direct"></relation>
<relation active="#struct-300009" type="indirect"></relation>
<relation active="#struct-423090" type="direct"></relation>
<relation active="#struct-206040" type="indirect"></relation>
<relation active="#struct-413289" type="indirect"></relation>
<relation name="UMR7503" active="#struct-441569" type="indirect"></relation>
</listRelation>
<tutelles><tutelle active="#struct-129671" type="direct"><org type="laboratory" xml:id="struct-129671" status="VALID"><idno type="RNSR">198618246Y</idno>
<orgName>INRIA Nancy - Grand Est</orgName>
<desc><address><addrLine>615 rue du Jardin Botanique 54600 Villers-lès-Nancy</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/nancy</ref>
</desc>
<listRelation><relation active="#struct-300009" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-300009" type="indirect"><org type="institution" xml:id="struct-300009" status="VALID"><orgName>Institut National de Recherche en Informatique et en Automatique</orgName>
<orgName type="acronym">Inria</orgName>
<desc><address><addrLine>Domaine de Voluceau, Rocquencourt - BP 105, 78153 Le Chesnay Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/en/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-423090" type="direct"><org type="department" xml:id="struct-423090" status="VALID"><orgName>Department of Complex Systems, Artificial Intelligence &amp; Robotics</orgName>
<orgName type="acronym">LORIA - AIS</orgName>
<desc><address><country key="FR"></country>
</address>
<ref type="url">http://www.loria.fr/la-recherche-en/departements/complex-system-and-artificial-intelligence</ref>
</desc>
<listRelation><relation active="#struct-206040" type="direct"></relation>
<relation active="#struct-300009" type="indirect"></relation>
<relation active="#struct-413289" type="indirect"></relation>
<relation name="UMR7503" active="#struct-441569" type="indirect"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-206040" type="indirect"><org type="laboratory" xml:id="struct-206040" status="VALID"><idno type="IdRef">067077927</idno>
<idno type="RNSR">198912571S</idno>
<idno type="IdUnivLorraine">[UL]RSI--</idno>
<orgName>Laboratoire Lorrain de Recherche en Informatique et ses Applications</orgName>
<orgName type="acronym">LORIA</orgName>
<date type="start">2012-01-01</date>
<desc><address><addrLine>Campus Scientifique BP 239 54506 Vandoeuvre-lès-Nancy Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.loria.fr</ref>
</desc>
<listRelation><relation active="#struct-300009" type="direct"></relation>
<relation active="#struct-413289" type="direct"></relation>
<relation name="UMR7503" active="#struct-441569" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-413289" type="indirect"><org type="institution" xml:id="struct-413289" status="VALID"><idno type="IdRef">157040569</idno>
<idno type="IdUnivLorraine">[UL]100--</idno>
<orgName>Université de Lorraine</orgName>
<orgName type="acronym">UL</orgName>
<date type="start">2012-01-01</date>
<desc><address><addrLine>34 cours Léopold - CS 25233 - 54052 Nancy cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-lorraine.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle name="UMR7503" active="#struct-441569" type="indirect"><org type="institution" xml:id="struct-441569" status="VALID"><idno type="IdRef">02636817X</idno>
<idno type="ISNI">0000000122597504</idno>
<orgName>Centre National de la Recherche Scientifique</orgName>
<orgName type="acronym">CNRS</orgName>
<date type="start">1939-10-19</date>
<desc><address><country key="FR"></country>
</address>
<ref type="url">http://www.cnrs.fr/</ref>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
<placeName><settlement type="city">Nancy</settlement>
<settlement type="city">Metz</settlement>
<region type="region" nuts="2">Grand Est</region>
<region type="old region" nuts="2">Lorraine (région)</region>
</placeName>
<orgName type="university">Université de Lorraine</orgName>
</affiliation>
</author>
<author><name sortKey="Gabillon, Victor" sort="Gabillon, Victor" uniqKey="Gabillon V" first="Victor" last="Gabillon">Victor Gabillon</name>
<affiliation wicri:level="1"><hal:affiliation type="researchteam" xml:id="struct-56016" status="OLD"><idno type="RNSR">200718281V</idno>
<orgName>Sequential Learning</orgName>
<orgName type="acronym">SEQUEL</orgName>
<date type="end">2014-12-31</date>
<desc><address><country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/equipes/sequel</ref>
</desc>
<listRelation><relation active="#struct-2546" type="direct"></relation>
<relation active="#struct-301700" type="indirect"></relation>
<relation active="#struct-92973" type="indirect"></relation>
<relation active="#struct-300009" type="indirect"></relation>
<relation name="UMR8022" active="#struct-441569" type="indirect"></relation>
<relation active="#struct-39658" type="direct"></relation>
<relation name="UMR8146" active="#struct-441569" type="indirect"></relation>
<relation active="#struct-120930" type="indirect"></relation>
<relation active="#struct-104752" type="direct"></relation>
</listRelation>
<tutelles><tutelle active="#struct-2546" type="direct"><org type="laboratory" xml:id="struct-2546" status="VALID"><orgName>Laboratoire d'Informatique Fondamentale de Lille</orgName>
<orgName type="acronym">LIFL</orgName>
<desc><address><addrLine>Bâtiment M3 59655 Villeneuve d'Ascq Cédex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.lifl.fr/</ref>
</desc>
<listRelation><relation active="#struct-301700" type="direct"></relation>
<relation active="#struct-92973" type="direct"></relation>
<relation active="#struct-300009" type="direct"></relation>
<relation name="UMR8022" active="#struct-441569" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-301700" type="indirect"><org type="institution" xml:id="struct-301700" status="VALID"><idno type="IdRef">026404524</idno>
<idno type="ISNI">0000000121517701</idno>
<orgName>Université de Lille, Sciences Humaines et Sociales</orgName>
<desc><address><addrLine>Domaine universitaire du "Pont de Bois", Rue du Barreau, BP 60149, 59653 Villeneuve d'Ascq Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-lille3.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-92973" type="indirect"><org type="institution" xml:id="struct-92973" status="VALID"><idno type="IdRef">026404184</idno>
<orgName>Université de Lille, Sciences et Technologies</orgName>
<desc><address><addrLine>Cité Scientifique - 59655 Villeneuve d'Ascq Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-lille1.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-300009" type="indirect"><org type="institution" xml:id="struct-300009" status="VALID"><orgName>Institut National de Recherche en Informatique et en Automatique</orgName>
<orgName type="acronym">Inria</orgName>
<desc><address><addrLine>Domaine de Voluceau, Rocquencourt - BP 105, 78153 Le Chesnay Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/en/</ref>
</desc>
</org>
</tutelle>
<tutelle name="UMR8022" active="#struct-441569" type="indirect"><org type="institution" xml:id="struct-441569" status="VALID"><idno type="IdRef">02636817X</idno>
<idno type="ISNI">0000000122597504</idno>
<orgName>Centre National de la Recherche Scientifique</orgName>
<orgName type="acronym">CNRS</orgName>
<date type="start">1939-10-19</date>
<desc><address><country key="FR"></country>
</address>
<ref type="url">http://www.cnrs.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-39658" type="direct"><org type="laboratory" xml:id="struct-39658" status="OLD"><orgName>Laboratoire d'Automatique, Génie Informatique et Signal</orgName>
<orgName type="acronym">LAGIS</orgName>
<desc><address><addrLine>LAGIS Ecole Centrale de Lille Cité Scientifique 59655 Villeneuve d'Ascq</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://lagis.ec-lille.fr/</ref>
</desc>
<listRelation><relation active="#struct-92973" type="direct"></relation>
<relation active="#struct-120930" type="direct"></relation>
<relation name="UMR8146" active="#struct-441569" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle name="UMR8146" active="#struct-441569" type="indirect"><org type="institution" xml:id="struct-441569" status="VALID"><idno type="IdRef">02636817X</idno>
<idno type="ISNI">0000000122597504</idno>
<orgName>Centre National de la Recherche Scientifique</orgName>
<orgName type="acronym">CNRS</orgName>
<date type="start">1939-10-19</date>
<desc><address><country key="FR"></country>
</address>
<ref type="url">http://www.cnrs.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-120930" type="indirect"><org type="institution" xml:id="struct-120930" status="VALID"><orgName>Ecole Centrale de Lille</orgName>
<desc><address><addrLine>Cité Scientifique - CS 20048 59651 Villeneuve d'Ascq Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.ec-lille.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-104752" type="direct"><org type="laboratory" xml:id="struct-104752" status="VALID"><idno type="RNSR">200818245B</idno>
<orgName>INRIA Lille - Nord Europe</orgName>
<desc><address><addrLine>Parc Scientifique de la Haute Borne 40, avenue Halley Bât.A, Park Plaza 59650 Villeneuve d'Ascq</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/lille/</ref>
</desc>
<listRelation><relation active="#struct-300009" type="direct"></relation>
</listRelation>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
</affiliation>
</author>
<author><name sortKey="Ghavamzadeh, Mohammad" sort="Ghavamzadeh, Mohammad" uniqKey="Ghavamzadeh M" first="Mohammad" last="Ghavamzadeh">Mohammad Ghavamzadeh</name>
<affiliation wicri:level="1"><hal:affiliation type="researchteam" xml:id="struct-56016" status="OLD"><idno type="RNSR">200718281V</idno>
<orgName>Sequential Learning</orgName>
<orgName type="acronym">SEQUEL</orgName>
<date type="end">2014-12-31</date>
<desc><address><country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/equipes/sequel</ref>
</desc>
<listRelation><relation active="#struct-2546" type="direct"></relation>
<relation active="#struct-301700" type="indirect"></relation>
<relation active="#struct-92973" type="indirect"></relation>
<relation active="#struct-300009" type="indirect"></relation>
<relation name="UMR8022" active="#struct-441569" type="indirect"></relation>
<relation active="#struct-39658" type="direct"></relation>
<relation name="UMR8146" active="#struct-441569" type="indirect"></relation>
<relation active="#struct-120930" type="indirect"></relation>
<relation active="#struct-104752" type="direct"></relation>
</listRelation>
<tutelles><tutelle active="#struct-2546" type="direct"><org type="laboratory" xml:id="struct-2546" status="VALID"><orgName>Laboratoire d'Informatique Fondamentale de Lille</orgName>
<orgName type="acronym">LIFL</orgName>
<desc><address><addrLine>Bâtiment M3 59655 Villeneuve d'Ascq Cédex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.lifl.fr/</ref>
</desc>
<listRelation><relation active="#struct-301700" type="direct"></relation>
<relation active="#struct-92973" type="direct"></relation>
<relation active="#struct-300009" type="direct"></relation>
<relation name="UMR8022" active="#struct-441569" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-301700" type="indirect"><org type="institution" xml:id="struct-301700" status="VALID"><idno type="IdRef">026404524</idno>
<idno type="ISNI">0000000121517701</idno>
<orgName>Université de Lille, Sciences Humaines et Sociales</orgName>
<desc><address><addrLine>Domaine universitaire du "Pont de Bois", Rue du Barreau, BP 60149, 59653 Villeneuve d'Ascq Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-lille3.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-92973" type="indirect"><org type="institution" xml:id="struct-92973" status="VALID"><idno type="IdRef">026404184</idno>
<orgName>Université de Lille, Sciences et Technologies</orgName>
<desc><address><addrLine>Cité Scientifique - 59655 Villeneuve d'Ascq Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-lille1.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-300009" type="indirect"><org type="institution" xml:id="struct-300009" status="VALID"><orgName>Institut National de Recherche en Informatique et en Automatique</orgName>
<orgName type="acronym">Inria</orgName>
<desc><address><addrLine>Domaine de Voluceau, Rocquencourt - BP 105, 78153 Le Chesnay Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/en/</ref>
</desc>
</org>
</tutelle>
<tutelle name="UMR8022" active="#struct-441569" type="indirect"><org type="institution" xml:id="struct-441569" status="VALID"><idno type="IdRef">02636817X</idno>
<idno type="ISNI">0000000122597504</idno>
<orgName>Centre National de la Recherche Scientifique</orgName>
<orgName type="acronym">CNRS</orgName>
<date type="start">1939-10-19</date>
<desc><address><country key="FR"></country>
</address>
<ref type="url">http://www.cnrs.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-39658" type="direct"><org type="laboratory" xml:id="struct-39658" status="OLD"><orgName>Laboratoire d'Automatique, Génie Informatique et Signal</orgName>
<orgName type="acronym">LAGIS</orgName>
<desc><address><addrLine>LAGIS Ecole Centrale de Lille Cité Scientifique 59655 Villeneuve d'Ascq</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://lagis.ec-lille.fr/</ref>
</desc>
<listRelation><relation active="#struct-92973" type="direct"></relation>
<relation active="#struct-120930" type="direct"></relation>
<relation name="UMR8146" active="#struct-441569" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle name="UMR8146" active="#struct-441569" type="indirect"><org type="institution" xml:id="struct-441569" status="VALID"><idno type="IdRef">02636817X</idno>
<idno type="ISNI">0000000122597504</idno>
<orgName>Centre National de la Recherche Scientifique</orgName>
<orgName type="acronym">CNRS</orgName>
<date type="start">1939-10-19</date>
<desc><address><country key="FR"></country>
</address>
<ref type="url">http://www.cnrs.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-120930" type="indirect"><org type="institution" xml:id="struct-120930" status="VALID"><orgName>Ecole Centrale de Lille</orgName>
<desc><address><addrLine>Cité Scientifique - CS 20048 59651 Villeneuve d'Ascq Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.ec-lille.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-104752" type="direct"><org type="laboratory" xml:id="struct-104752" status="VALID"><idno type="RNSR">200818245B</idno>
<orgName>INRIA Lille - Nord Europe</orgName>
<desc><address><addrLine>Parc Scientifique de la Haute Borne 40, avenue Halley Bât.A, Park Plaza 59650 Villeneuve d'Ascq</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/lille/</ref>
</desc>
<listRelation><relation active="#struct-300009" type="direct"></relation>
</listRelation>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
</affiliation>
</author>
<author><name sortKey="Geist, Matthieu" sort="Geist, Matthieu" uniqKey="Geist M" first="Matthieu" last="Geist">Matthieu Geist</name>
<affiliation wicri:level="1"><hal:affiliation type="researchteam" xml:id="struct-394500" status="INCOMING"><orgName>IMS - Equipe Information, Multimodalité et Signal</orgName>
<desc><address><country key="FR"></country>
</address>
</desc>
<listRelation><relation active="#struct-24541" type="direct"></relation>
<relation active="#struct-242365" type="indirect"></relation>
<relation active="#struct-411575" type="indirect"></relation>
<relation active="#struct-301991" type="indirect"></relation>
<relation active="#struct-301990" type="indirect"></relation>
<relation active="#struct-300812" type="indirect"></relation>
<relation active="#struct-300413" type="indirect"></relation>
<relation active="#struct-300289" type="indirect"></relation>
<relation name="UMI2958" active="#struct-441569" type="indirect"></relation>
<relation active="#struct-26305" type="direct"></relation>
</listRelation>
<tutelles><tutelle active="#struct-24541" type="direct"><org type="laboratory" xml:id="struct-24541" status="VALID"><idno type="RNSR">200619366D</idno>
<orgName>Georgia Tech - CNRS [Metz]</orgName>
<orgName type="acronym">UMI2958</orgName>
<desc><address><addrLine>Metz Technopôle 2-3 rue Marconi 57070 METZ</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.umi2958.eu</ref>
</desc>
<listRelation><relation active="#struct-242365" type="direct"></relation>
<relation active="#struct-411575" type="direct"></relation>
<relation active="#struct-301991" type="direct"></relation>
<relation active="#struct-301990" type="direct"></relation>
<relation active="#struct-300812" type="direct"></relation>
<relation active="#struct-300413" type="direct"></relation>
<relation active="#struct-300289" type="direct"></relation>
<relation name="UMI2958" active="#struct-441569" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-242365" type="indirect"><org type="institution" xml:id="struct-242365" status="VALID"><idno type="IdRef">026403188</idno>
<idno type="ISNI">0000 0001 2188 3779 </idno>
<orgName>Université de Franche-Comté</orgName>
<orgName type="acronym">UFC</orgName>
<desc><address><country key="FR"></country>
</address>
<ref type="url">http://www.univ-fcomte.fr</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-411575" type="indirect"><org type="institution" xml:id="struct-411575" status="VALID"><orgName>CentraleSupélec</orgName>
<desc><address><addrLine>3, rue Joliot Curie, Plateau de Moulon, 91192 GIF-SUR-YVETTE Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.centralesupelec.fr</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-301991" type="indirect"><org type="institution" xml:id="struct-301991" status="VALID"><orgName>Georgia Tech Lorraine</orgName>
<desc><address><country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-301990" type="indirect"><org type="institution" xml:id="struct-301990" status="VALID"><orgName>Georgia Institute of Technology [Atlanta]</orgName>
<desc><address><addrLine>North Avenue, Atlanta, GA 30332</addrLine>
<country key="US"></country>
</address>
<ref type="url">http://www.gatech.edu/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-300812" type="indirect"><org type="institution" xml:id="struct-300812" status="VALID"><orgName>SUPELEC</orgName>
<desc><address><country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-300413" type="indirect"><org type="institution" xml:id="struct-300413" status="VALID"><orgName>Ecole Nationale Supérieure des Arts et Metiers Metz</orgName>
<desc><address><country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-300289" type="indirect"><org type="institution" xml:id="struct-300289" status="OLD"><orgName>Université Paul Verlaine - Metz</orgName>
<orgName type="acronym">UPVM</orgName>
<date type="end">2011-12-31</date>
<desc><address><country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle name="UMI2958" active="#struct-441569" type="indirect"><org type="institution" xml:id="struct-441569" status="VALID"><idno type="IdRef">02636817X</idno>
<idno type="ISNI">0000000122597504</idno>
<orgName>Centre National de la Recherche Scientifique</orgName>
<orgName type="acronym">CNRS</orgName>
<date type="start">1939-10-19</date>
<desc><address><country key="FR"></country>
</address>
<ref type="url">http://www.cnrs.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-26305" type="direct"><org type="laboratory" xml:id="struct-26305" status="VALID"><orgName>SUPELEC-Campus Metz</orgName>
<desc><address><addrLine>2 rue Edouard Belin 57070 Metz</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.metz.supelec.fr/metz/</ref>
</desc>
<listRelation><relation active="#struct-300812" type="direct"></relation>
</listRelation>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
<placeName><settlement type="city" wicri:auto="siege">Besançon</settlement>
<region type="region" nuts="2">Franche-Comté</region>
</placeName>
<orgName type="university">Université de Franche-Comté</orgName>
<orgName type="institution" wicri:auto="newGroup">Université de Bourgogne Franche-Comté</orgName>
<placeName><settlement type="city">Metz</settlement>
<region type="region" nuts="2">Grand Est</region>
<region type="old region" nuts="2">Lorraine (région)</region>
</placeName>
<orgName type="university">Université Paul Verlaine - Metz</orgName>
<orgName type="institution" wicri:auto="newGroup">Université de Lorraine</orgName>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">HAL</idno>
<idno type="RBID">Hal:hal-00736226</idno>
<idno type="halId">hal-00736226</idno>
<idno type="halUri">https://hal.inria.fr/hal-00736226</idno>
<idno type="url">https://hal.inria.fr/hal-00736226</idno>
<date when="2012-05-22">2012-05-22</date>
<idno type="wicri:Area/Hal/Corpus">005721</idno>
<idno type="wicri:Area/Hal/Curation">005721</idno>
<idno type="wicri:Area/Hal/Checkpoint">001744</idno>
<idno type="wicri:explorRef" wicri:stream="Hal" wicri:step="Checkpoint">001744</idno>
<idno type="wicri:Area/Main/Merge">001B54</idno>
<idno type="wicri:Area/Main/Curation">001B24</idno>
<idno type="wicri:Area/Main/Exploration">001B24</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="fr">Approximations de l'Algorithme Itérations sur les Politiques Modifié</title>
<author><name sortKey="Scherrer, Bruno" sort="Scherrer, Bruno" uniqKey="Scherrer B" first="Bruno" last="Scherrer">Bruno Scherrer</name>
<affiliation wicri:level="1"><hal:affiliation type="researchteam" xml:id="struct-205124" status="OLD"><idno type="RNSR">200218290B</idno>
<orgName>Autonomous intelligent machine</orgName>
<orgName type="acronym">MAIA</orgName>
<date type="end">2014-12-31</date>
<desc><address><country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/equipes/maia</ref>
</desc>
<listRelation><relation active="#struct-129671" type="direct"></relation>
<relation active="#struct-300009" type="indirect"></relation>
<relation active="#struct-423090" type="direct"></relation>
<relation active="#struct-206040" type="indirect"></relation>
<relation active="#struct-413289" type="indirect"></relation>
<relation name="UMR7503" active="#struct-441569" type="indirect"></relation>
</listRelation>
<tutelles><tutelle active="#struct-129671" type="direct"><org type="laboratory" xml:id="struct-129671" status="VALID"><idno type="RNSR">198618246Y</idno>
<orgName>INRIA Nancy - Grand Est</orgName>
<desc><address><addrLine>615 rue du Jardin Botanique 54600 Villers-lès-Nancy</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/nancy</ref>
</desc>
<listRelation><relation active="#struct-300009" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-300009" type="indirect"><org type="institution" xml:id="struct-300009" status="VALID"><orgName>Institut National de Recherche en Informatique et en Automatique</orgName>
<orgName type="acronym">Inria</orgName>
<desc><address><addrLine>Domaine de Voluceau, Rocquencourt - BP 105, 78153 Le Chesnay Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/en/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-423090" type="direct"><org type="department" xml:id="struct-423090" status="VALID"><orgName>Department of Complex Systems, Artificial Intelligence &amp; Robotics</orgName>
<orgName type="acronym">LORIA - AIS</orgName>
<desc><address><country key="FR"></country>
</address>
<ref type="url">http://www.loria.fr/la-recherche-en/departements/complex-system-and-artificial-intelligence</ref>
</desc>
<listRelation><relation active="#struct-206040" type="direct"></relation>
<relation active="#struct-300009" type="indirect"></relation>
<relation active="#struct-413289" type="indirect"></relation>
<relation name="UMR7503" active="#struct-441569" type="indirect"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-206040" type="indirect"><org type="laboratory" xml:id="struct-206040" status="VALID"><idno type="IdRef">067077927</idno>
<idno type="RNSR">198912571S</idno>
<idno type="IdUnivLorraine">[UL]RSI--</idno>
<orgName>Laboratoire Lorrain de Recherche en Informatique et ses Applications</orgName>
<orgName type="acronym">LORIA</orgName>
<date type="start">2012-01-01</date>
<desc><address><addrLine>Campus Scientifique BP 239 54506 Vandoeuvre-lès-Nancy Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.loria.fr</ref>
</desc>
<listRelation><relation active="#struct-300009" type="direct"></relation>
<relation active="#struct-413289" type="direct"></relation>
<relation name="UMR7503" active="#struct-441569" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-413289" type="indirect"><org type="institution" xml:id="struct-413289" status="VALID"><idno type="IdRef">157040569</idno>
<idno type="IdUnivLorraine">[UL]100--</idno>
<orgName>Université de Lorraine</orgName>
<orgName type="acronym">UL</orgName>
<date type="start">2012-01-01</date>
<desc><address><addrLine>34 cours Léopold - CS 25233 - 54052 Nancy cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-lorraine.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle name="UMR7503" active="#struct-441569" type="indirect"><org type="institution" xml:id="struct-441569" status="VALID"><idno type="IdRef">02636817X</idno>
<idno type="ISNI">0000000122597504</idno>
<orgName>Centre National de la Recherche Scientifique</orgName>
<orgName type="acronym">CNRS</orgName>
<date type="start">1939-10-19</date>
<desc><address><country key="FR"></country>
</address>
<ref type="url">http://www.cnrs.fr/</ref>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
<placeName><settlement type="city">Nancy</settlement>
<settlement type="city">Metz</settlement>
<region type="region" nuts="2">Grand Est</region>
<region type="old region" nuts="2">Lorraine (région)</region>
</placeName>
<orgName type="university">Université de Lorraine</orgName>
</affiliation>
</author>
<author><name sortKey="Gabillon, Victor" sort="Gabillon, Victor" uniqKey="Gabillon V" first="Victor" last="Gabillon">Victor Gabillon</name>
<affiliation wicri:level="1"><hal:affiliation type="researchteam" xml:id="struct-56016" status="OLD"><idno type="RNSR">200718281V</idno>
<orgName>Sequential Learning</orgName>
<orgName type="acronym">SEQUEL</orgName>
<date type="end">2014-12-31</date>
<desc><address><country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/equipes/sequel</ref>
</desc>
<listRelation><relation active="#struct-2546" type="direct"></relation>
<relation active="#struct-301700" type="indirect"></relation>
<relation active="#struct-92973" type="indirect"></relation>
<relation active="#struct-300009" type="indirect"></relation>
<relation name="UMR8022" active="#struct-441569" type="indirect"></relation>
<relation active="#struct-39658" type="direct"></relation>
<relation name="UMR8146" active="#struct-441569" type="indirect"></relation>
<relation active="#struct-120930" type="indirect"></relation>
<relation active="#struct-104752" type="direct"></relation>
</listRelation>
<tutelles><tutelle active="#struct-2546" type="direct"><org type="laboratory" xml:id="struct-2546" status="VALID"><orgName>Laboratoire d'Informatique Fondamentale de Lille</orgName>
<orgName type="acronym">LIFL</orgName>
<desc><address><addrLine>Bâtiment M3 59655 Villeneuve d'Ascq Cédex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.lifl.fr/</ref>
</desc>
<listRelation><relation active="#struct-301700" type="direct"></relation>
<relation active="#struct-92973" type="direct"></relation>
<relation active="#struct-300009" type="direct"></relation>
<relation name="UMR8022" active="#struct-441569" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-301700" type="indirect"><org type="institution" xml:id="struct-301700" status="VALID"><idno type="IdRef">026404524</idno>
<idno type="ISNI">0000000121517701</idno>
<orgName>Université de Lille, Sciences Humaines et Sociales</orgName>
<desc><address><addrLine>Domaine universitaire du "Pont de Bois", Rue du Barreau, BP 60149, 59653 Villeneuve d'Ascq Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-lille3.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-92973" type="indirect"><org type="institution" xml:id="struct-92973" status="VALID"><idno type="IdRef">026404184</idno>
<orgName>Université de Lille, Sciences et Technologies</orgName>
<desc><address><addrLine>Cité Scientifique - 59655 Villeneuve d'Ascq Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-lille1.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-300009" type="indirect"><org type="institution" xml:id="struct-300009" status="VALID"><orgName>Institut National de Recherche en Informatique et en Automatique</orgName>
<orgName type="acronym">Inria</orgName>
<desc><address><addrLine>Domaine de Voluceau Rocquencourt - BP 105 78153 Le Chesnay Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/en/</ref>
</desc>
</org>
</tutelle>
<tutelle name="UMR8022" active="#struct-441569" type="indirect"><org type="institution" xml:id="struct-441569" status="VALID"><idno type="IdRef">02636817X</idno>
<idno type="ISNI">0000000122597504</idno>
<orgName>Centre National de la Recherche Scientifique</orgName>
<orgName type="acronym">CNRS</orgName>
<date type="start">1939-10-19</date>
<desc><address><country key="FR"></country>
</address>
<ref type="url">http://www.cnrs.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-39658" type="direct"><org type="laboratory" xml:id="struct-39658" status="OLD"><orgName>Laboratoire d'Automatique, Génie Informatique et Signal</orgName>
<orgName type="acronym">LAGIS</orgName>
<desc><address><addrLine>LAGIS Ecole Centrale de Lille Cité Scientifique 59655 Villeneuve d'Ascq</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://lagis.ec-lille.fr/</ref>
</desc>
<listRelation><relation active="#struct-92973" type="direct"></relation>
<relation active="#struct-120930" type="direct"></relation>
<relation name="UMR8146" active="#struct-441569" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle name="UMR8146" active="#struct-441569" type="indirect"><org type="institution" xml:id="struct-441569" status="VALID"><idno type="IdRef">02636817X</idno>
<idno type="ISNI">0000000122597504</idno>
<orgName>Centre National de la Recherche Scientifique</orgName>
<orgName type="acronym">CNRS</orgName>
<date type="start">1939-10-19</date>
<desc><address><country key="FR"></country>
</address>
<ref type="url">http://www.cnrs.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-120930" type="indirect"><org type="institution" xml:id="struct-120930" status="VALID"><orgName>Ecole Centrale de Lille</orgName>
<desc><address><addrLine>Cité Scientifique - CS 20048 59651 Villeneuve d'Ascq Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.ec-lille.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-104752" type="direct"><org type="laboratory" xml:id="struct-104752" status="VALID"><idno type="RNSR">200818245B</idno>
<orgName>INRIA Lille - Nord Europe</orgName>
<desc><address><addrLine>Parc Scientifique de la Haute Borne 40, avenue Halley Bât.A, Park Plaza 59650 Villeneuve d'Ascq</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/lille/</ref>
</desc>
<listRelation><relation active="#struct-300009" type="direct"></relation>
</listRelation>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
</affiliation>
</author>
<author><name sortKey="Ghavamzadeh, Mohammad" sort="Ghavamzadeh, Mohammad" uniqKey="Ghavamzadeh M" first="Mohammad" last="Ghavamzadeh">Mohammad Ghavamzadeh</name>
<affiliation wicri:level="1"><hal:affiliation type="researchteam" xml:id="struct-56016" status="OLD"><idno type="RNSR">200718281V</idno>
<orgName>Sequential Learning</orgName>
<orgName type="acronym">SEQUEL</orgName>
<date type="end">2014-12-31</date>
<desc><address><country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/equipes/sequel</ref>
</desc>
<listRelation><relation active="#struct-2546" type="direct"></relation>
<relation active="#struct-301700" type="indirect"></relation>
<relation active="#struct-92973" type="indirect"></relation>
<relation active="#struct-300009" type="indirect"></relation>
<relation name="UMR8022" active="#struct-441569" type="indirect"></relation>
<relation active="#struct-39658" type="direct"></relation>
<relation name="UMR8146" active="#struct-441569" type="indirect"></relation>
<relation active="#struct-120930" type="indirect"></relation>
<relation active="#struct-104752" type="direct"></relation>
</listRelation>
<tutelles><tutelle active="#struct-2546" type="direct"><org type="laboratory" xml:id="struct-2546" status="VALID"><orgName>Laboratoire d'Informatique Fondamentale de Lille</orgName>
<orgName type="acronym">LIFL</orgName>
<desc><address><addrLine>Bâtiment M3 59655 Villeneuve d'Ascq Cédex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.lifl.fr/</ref>
</desc>
<listRelation><relation active="#struct-301700" type="direct"></relation>
<relation active="#struct-92973" type="direct"></relation>
<relation active="#struct-300009" type="direct"></relation>
<relation name="UMR8022" active="#struct-441569" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-301700" type="indirect"><org type="institution" xml:id="struct-301700" status="VALID"><idno type="IdRef">026404524</idno>
<idno type="ISNI">0000000121517701</idno>
<orgName>Université de Lille, Sciences Humaines et Sociales</orgName>
<desc><address><addrLine>Domaine universitaire du "Pont de Bois" Rue du Barreau BP 60149 59653 Villeneuve d'Ascq Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-lille3.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-92973" type="indirect"><org type="institution" xml:id="struct-92973" status="VALID"><idno type="IdRef">026404184</idno>
<orgName>Université de Lille, Sciences et Technologies</orgName>
<desc><address><addrLine>Cité Scientifique - 59655 Villeneuve d'Ascq Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-lille1.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-300009" type="indirect"><org type="institution" xml:id="struct-300009" status="VALID"><orgName>Institut National de Recherche en Informatique et en Automatique</orgName>
<orgName type="acronym">Inria</orgName>
<desc><address><addrLine>Domaine de Voluceau Rocquencourt - BP 105 78153 Le Chesnay Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/en/</ref>
</desc>
</org>
</tutelle>
<tutelle name="UMR8022" active="#struct-441569" type="indirect"><org type="institution" xml:id="struct-441569" status="VALID"><idno type="IdRef">02636817X</idno>
<idno type="ISNI">0000000122597504</idno>
<orgName>Centre National de la Recherche Scientifique</orgName>
<orgName type="acronym">CNRS</orgName>
<date type="start">1939-10-19</date>
<desc><address><country key="FR"></country>
</address>
<ref type="url">http://www.cnrs.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-39658" type="direct"><org type="laboratory" xml:id="struct-39658" status="OLD"><orgName>Laboratoire d'Automatique, Génie Informatique et Signal</orgName>
<orgName type="acronym">LAGIS</orgName>
<desc><address><addrLine>LAGIS Ecole Centrale de Lille Cité Scientifique 59655 Villeneuve d'Ascq</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://lagis.ec-lille.fr/</ref>
</desc>
<listRelation><relation active="#struct-92973" type="direct"></relation>
<relation active="#struct-120930" type="direct"></relation>
<relation name="UMR8146" active="#struct-441569" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle name="UMR8146" active="#struct-441569" type="indirect"><org type="institution" xml:id="struct-441569" status="VALID"><idno type="IdRef">02636817X</idno>
<idno type="ISNI">0000000122597504</idno>
<orgName>Centre National de la Recherche Scientifique</orgName>
<orgName type="acronym">CNRS</orgName>
<date type="start">1939-10-19</date>
<desc><address><country key="FR"></country>
</address>
<ref type="url">http://www.cnrs.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-120930" type="indirect"><org type="institution" xml:id="struct-120930" status="VALID"><orgName>Ecole Centrale de Lille</orgName>
<desc><address><addrLine>Cité Scientifique - CS 20048 59651 Villeneuve d'Ascq Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.ec-lille.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-104752" type="direct"><org type="laboratory" xml:id="struct-104752" status="VALID"><idno type="RNSR">200818245B</idno>
<orgName>INRIA Lille - Nord Europe</orgName>
<desc><address><addrLine>Parc Scientifique de la Haute Borne 40, avenue Halley Bât.A, Park Plaza 59650 Villeneuve d'Ascq</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/lille/</ref>
</desc>
<listRelation><relation active="#struct-300009" type="direct"></relation>
</listRelation>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
</affiliation>
</author>
<author><name sortKey="Geist, Matthieu" sort="Geist, Matthieu" uniqKey="Geist M" first="Matthieu" last="Geist">Matthieu Geist</name>
<affiliation wicri:level="1"><hal:affiliation type="researchteam" xml:id="struct-394500" status="INCOMING"><orgName>IMS - Equipe Information, Multimodalité et Signal</orgName>
<desc><address><country key="FR"></country>
</address>
</desc>
<listRelation><relation active="#struct-24541" type="direct"></relation>
<relation active="#struct-242365" type="indirect"></relation>
<relation active="#struct-411575" type="indirect"></relation>
<relation active="#struct-301991" type="indirect"></relation>
<relation active="#struct-301990" type="indirect"></relation>
<relation active="#struct-300812" type="indirect"></relation>
<relation active="#struct-300413" type="indirect"></relation>
<relation active="#struct-300289" type="indirect"></relation>
<relation name="UMI2958" active="#struct-441569" type="indirect"></relation>
<relation active="#struct-26305" type="direct"></relation>
</listRelation>
<tutelles><tutelle active="#struct-24541" type="direct"><org type="laboratory" xml:id="struct-24541" status="VALID"><idno type="RNSR">200619366D</idno>
<orgName>Georgia Tech - CNRS [Metz]</orgName>
<orgName type="acronym">UMI2958</orgName>
<desc><address><addrLine>Metz Technopôle 2-3 rue Marconi 57070 METZ</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.umi2958.eu</ref>
</desc>
<listRelation><relation active="#struct-242365" type="direct"></relation>
<relation active="#struct-411575" type="direct"></relation>
<relation active="#struct-301991" type="direct"></relation>
<relation active="#struct-301990" type="direct"></relation>
<relation active="#struct-300812" type="direct"></relation>
<relation active="#struct-300413" type="direct"></relation>
<relation active="#struct-300289" type="direct"></relation>
<relation name="UMI2958" active="#struct-441569" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-242365" type="indirect"><org type="institution" xml:id="struct-242365" status="VALID"><idno type="IdRef">026403188</idno>
<idno type="ISNI">0000 0001 2188 3779</idno>
<orgName>Université de Franche-Comté</orgName>
<orgName type="acronym">UFC</orgName>
<desc><address><country key="FR"></country>
</address>
<ref type="url">http://www.univ-fcomte.fr</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-411575" type="indirect"><org type="institution" xml:id="struct-411575" status="VALID"><orgName>CentraleSupélec</orgName>
<desc><address><addrLine>3, rue Joliot Curie, Plateau de Moulon, 91192 GIF-SUR-YVETTE Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.centralesupelec.fr</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-301991" type="indirect"><org type="institution" xml:id="struct-301991" status="VALID"><orgName>Georgia Tech Lorraine</orgName>
<desc><address><country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-301990" type="indirect"><org type="institution" xml:id="struct-301990" status="VALID"><orgName>Georgia Institute of Technology [Atlanta]</orgName>
<desc><address><addrLine>North Avenue, Atlanta, GA 30332</addrLine>
<country key="US"></country>
</address>
<ref type="url">http://www.gatech.edu/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-300812" type="indirect"><org type="institution" xml:id="struct-300812" status="VALID"><orgName>SUPELEC</orgName>
<desc><address><country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-300413" type="indirect"><org type="institution" xml:id="struct-300413" status="VALID"><orgName>Ecole Nationale Supérieure des Arts et Metiers Metz</orgName>
<desc><address><country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-300289" type="indirect"><org type="institution" xml:id="struct-300289" status="OLD"><orgName>Université Paul Verlaine - Metz</orgName>
<orgName type="acronym">UPVM</orgName>
<date type="end">2011-12-31</date>
<desc><address><country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle name="UMI2958" active="#struct-441569" type="indirect"><org type="institution" xml:id="struct-441569" status="VALID"><idno type="IdRef">02636817X</idno>
<idno type="ISNI">0000000122597504</idno>
<orgName>Centre National de la Recherche Scientifique</orgName>
<orgName type="acronym">CNRS</orgName>
<date type="start">1939-10-19</date>
<desc><address><country key="FR"></country>
</address>
<ref type="url">http://www.cnrs.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-26305" type="direct"><org type="laboratory" xml:id="struct-26305" status="VALID"><orgName>SUPELEC-Campus Metz</orgName>
<desc><address><addrLine>2 rue Edouard Belin 57070 Metz</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.metz.supelec.fr/metz/</ref>
</desc>
<listRelation><relation active="#struct-300812" type="direct"></relation>
</listRelation>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
<placeName><settlement type="city" wicri:auto="siege">Besançon</settlement>
<region type="region" nuts="2">Franche-Comté</region>
</placeName>
<orgName type="university">Université de Franche-Comté</orgName>
<orgName type="institution" wicri:auto="newGroup">Université de Bourgogne Franche-Comté</orgName>
<placeName><settlement type="city">Metz</settlement>
<region type="region" nuts="2">Grand Est</region>
<region type="old region" nuts="2">Lorraine (région)</region>
</placeName>
<orgName type="university">Université Paul Verlaine - Metz</orgName>
<orgName type="institution" wicri:auto="newGroup">Université de Lorraine</orgName>
</affiliation>
</author>
</analytic>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc><textClass></textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="fr">Itérations sur les politiques modifié (MPI) est un algorithme de programmation dynamique qui généralise les deux algorithmes célèbres Itérations sur les valeurs (VI) et sur les politiques (PI). Malgré sa généralité, cet algorithme - et particulièrement sa mise en œuvre approchée, qui est utilisée lorsque les espaces d'états/actions sont très grands - n'a pas encore fait l'objet d'une analyse approfondie. Nous proposons ici trois implémentations approchées de MPI (AMPI) qui sont des extensions d'algorithmes de la littérature (Fitted Value Iteration, Fitted Q-Iteration et Classification Based Policy Iteration). Nous développons une analyse de la propagation d'erreur qui unifie celles développées indépendamment pour VI et PI dans la littérature. Nous fournissons enfin une analyse en échantillons finis pour le dernier algorithme, basé sur un classifieur de politiques, qui est en quelque sorte le plus général. Une observation intéressante est que le paramètre principal de MPI permet de contrôler, dans la borne de performance, l'équilibre entre les erreurs dans le calcul des valeurs et celles dans l'estimation de la politique gourmande.</div>
</front>
</TEI>
<affiliations><list><country><li>France</li>
</country>
<region><li>Franche-Comté</li>
<li>Grand Est</li>
<li>Lorraine (région)</li>
</region>
<settlement><li>Besançon</li>
<li>Metz</li>
<li>Nancy</li>
</settlement>
<orgName><li>Université Paul Verlaine - Metz</li>
<li>Université de Bourgogne Franche-Comté</li>
<li>Université de Franche-Comté</li>
<li>Université de Lorraine</li>
</orgName>
</list>
<tree><country name="France"><region name="Grand Est"><name sortKey="Scherrer, Bruno" sort="Scherrer, Bruno" uniqKey="Scherrer B" first="Bruno" last="Scherrer">Bruno Scherrer</name>
</region>
<name sortKey="Gabillon, Victor" sort="Gabillon, Victor" uniqKey="Gabillon V" first="Victor" last="Gabillon">Victor Gabillon</name>
<name sortKey="Geist, Matthieu" sort="Geist, Matthieu" uniqKey="Geist M" first="Matthieu" last="Geist">Matthieu Geist</name>
<name sortKey="Ghavamzadeh, Mohammad" sort="Ghavamzadeh, Mohammad" uniqKey="Ghavamzadeh M" first="Mohammad" last="Ghavamzadeh">Mohammad Ghavamzadeh</name>
</country>
</tree>
</affiliations>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Wicri/Lorraine/explor/InforLorV4/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001B24 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 001B24 | SxmlIndent | more
To add a link to this page within the Wicri network
{{Explor lien |wiki= Wicri/Lorraine |area= InforLorV4 |flux= Main |étape= Exploration |type= RBID |clé= Hal:hal-00736226 |texte= Approximations de l'Algorithme Itérations sur les Politiques Modifié }}
This area was generated with Dilib version V0.6.33.